
An MDP model-based reinforcement learning approach for production station ramp-up optimization: Q-learning analysis


Abstract

Ramp-up is a significant bottleneck for the introduction of new or adapted manufacturing systems. The effort and time required to ramp up a system are largely dependent on the effectiveness of the human decision-making process in selecting the most promising sequence of actions to improve the system to the required level of performance. Although existing work has identified significant factors influencing the effectiveness of ramp-up, little has been done to support the decision making during the process. This paper approaches ramp-up as a sequential adjustment and tuning process that aims to get a manufacturing system to a desirable performance in the fastest possible time. Production stations and machines are the key resources in a manufacturing system. They are often functionally decoupled and can be treated in the first instance as independent ramp-up problems. Hence, this paper focuses on developing a Markov decision process (MDP) model to formalize ramp-up of production stations and enable their formal analysis. The aim is to capture the cause-and-effect relationships between an operator's adaptation or adjustment of a station and the station's response, in order to improve the effectiveness of the process. Reinforcement learning has been identified as a promising approach to learn from ramp-up experience and discover more successful decision-making policies. Batch learning in particular can perform well with little data. This paper investigates the application of a Q-batch learning algorithm combined with an MDP model of the ramp-up process. The approach has been applied to a highly automated production station where several ramp-up processes are carried out. The convergence of the Q-learning algorithm has been analyzed along with the variation of its parameters. Finally, the learned policy has been applied and compared against previous ramp-up cases.
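To make the abstract's approach concrete, the sketch below illustrates tabular Q-learning driven by batch sweeps over stored experience, i.e. the standard update Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)) applied repeatedly to a fixed set of logged transitions. The state space (discrete station performance levels), the operator actions, the reward shaping, and the simulated station dynamics are all illustrative assumptions for this sketch, not the paper's actual MDP model or experimental setup.

    # Hypothetical sketch of batch ("Q-batch") learning on a station ramp-up MDP.
    # States, actions, rewards, and dynamics are assumptions, not the paper's model.
    import random
    from collections import defaultdict

    ACTIONS = ["adjust_feed_rate", "retune_gripper", "recalibrate_sensor"]  # assumed adjustments
    ALPHA, GAMMA = 0.1, 0.9    # learning rate and discount factor
    EPISODES, SWEEPS = 50, 20  # ramp-up runs collected, batch passes over the stored data
    TARGET = 5                 # assumed target performance level (states 0..5)

    Q = defaultdict(float)     # tabular Q(s, a), keyed by (state, action)

    def policy(state, epsilon=0.2):
        """Epsilon-greedy action selection over the assumed action set."""
        if random.random() < epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    def simulate_step(state, action):
        """Stand-in for the real station: returns (next_state, reward).
        Reward is assumed to reflect the station's performance gain."""
        next_state = min(state + random.choice([0, 1]), TARGET)
        reward = 1.0 if next_state == TARGET else -0.1  # bonus at target, small step cost
        return next_state, reward

    # 1) Collect transitions from (simulated) ramp-up episodes.
    batch = []
    for _ in range(EPISODES):
        s = 0
        while s != TARGET:
            a = policy(s)
            s2, r = simulate_step(s, a)
            batch.append((s, a, r, s2))
            s = s2

    # 2) Batch learning: repeatedly sweep the stored experience,
    #    applying the Q-learning update to every logged transition.
    for _ in range(SWEEPS):
        random.shuffle(batch)
        for s, a, r, s2 in batch:
            target = r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] += ALPHA * (target - Q[(s, a)])

    # Greedy policy learned from the batch, per performance level:
    for s in range(TARGET + 1):
        print(s, max(ACTIONS, key=lambda a: Q[(s, a)]))

Because every sweep reuses the same logged transitions, the value estimates converge with far fewer station interactions than online learning would need, which matches the abstract's point that batch learning can perform well with little data.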
